AI Hallucination Detection and Mitigation Prompt Injection Attack on AI

Avneet Singh Sehre; Shahid Khan

Cite This Article :

Shahid Khan | Avneet Singh Sehre "AI Hallucination Detection and Mitigation Prompt Injection Attack on AI" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Recent Advances in Computer Applications and Information Technology, March 2026, pp.98-102, URL: https://www.ijtsrd.com/papers/ijtsrd101287.pdf

Abstract :
Large Language Models (LLMs) have shown impressive skill in understanding and generating natural language. However, as their adoption grows, important security and reliability issues have emerged. Two key challenges are AI hallucinations, where models produce incorrect or fabricated information, and prompt injection attacks, where malicious inputs trick the model into producing unwanted or harmful outputs. These weaknesses pose serious risks in areas such as education, cybersecurity, healthcare, and automated decision-making systems. This paper offers a detailed study of AI hallucinations and prompt injection attacks, followed by the development of a hybrid detection and mitigation framework. The proposed framework has a multi-layer structure comprising input monitoring, semantic risk analysis, hallucination detection through fact-consistency checks, and a mitigation engine that performs response regeneration and prompt cleaning. The system also adds a risk-scoring method to categorise outputs by their likelihood of being manipulated or hallucinated. In experiments and simulations, the proposed method shows better detection accuracy and fewer incorrect outputs than standard language model responses. The study underscores the need to incorporate security-aware methods directly into AI workflows and lays the groundwork for creating safer and more reliable AI systems.

Large Language Models (LLMs) have quickly become essential in modern AI applications because they can produce fluent and context-sensitive responses. Despite their strengths, these systems face serious reliability and security challenges. Two major concerns are AI hallucinations, where the model produces incorrect or fabricated information, and prompt injection attacks, where harmful inputs manipulate the model into generating unintended or harmful outputs. These problems reduce user trust and create serious risks in fields that rely on accurate and secure AI-driven decision-making.
This research provides a structured analysis of both hallucination behaviour and prompt injection weaknesses, followed by the design of a hybrid detection and mitigation framework. The proposed approach features a layered monitoring system that checks user inputs for harmful patterns and verifies generated outputs for factual accuracy. By incorporating risk scoring and mitigation strategies within a single structure, the framework improves both reliability and security while keeping the core language model unchanged. Experimental evaluations show better detection results and reduced vulnerability compared to standard systems.
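
The layered flow described in the abstract (input monitoring, a fact-consistency check on the output, risk scoring, and prompt cleaning) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pattern list, the token-overlap heuristic standing in for the fact-consistency check, the thresholds, and all function names are assumptions made for illustration only.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical injection patterns; the paper does not publish its rule set.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"disregard (the )?system prompt",
    r"reveal (your|the) (system prompt|hidden instructions)",
]


def injection_score(prompt: str) -> float:
    """Input monitoring: 1.0 if any assumed pattern matches, else 0.0."""
    text = prompt.lower()
    return 1.0 if any(re.search(p, text) for p in INJECTION_PATTERNS) else 0.0


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def hallucination_score(response: str, references: list[str]) -> float:
    """Crude stand-in for a fact-consistency check.

    Returns 1 minus the best fraction of response tokens that also appear
    in a single reference passage, so unsupported responses score near 1.0.
    """
    resp = _tokens(response)
    if not resp:
        return 0.0
    best = max((len(resp & _tokens(r)) / len(resp) for r in references),
               default=0.0)
    return 1.0 - best


@dataclass
class Assessment:
    injection: float
    hallucination: float
    risk: float
    label: str


def assess(prompt: str, response: str, references: list[str]) -> Assessment:
    """Risk scoring: take the worse of the two signals and bucket it."""
    inj = injection_score(prompt)
    hal = hallucination_score(response, references)
    risk = max(inj, hal)
    # Thresholds are illustrative assumptions, not values from the paper.
    label = "high" if risk >= 0.7 else "medium" if risk >= 0.3 else "low"
    return Assessment(inj, hal, risk, label)


def sanitize_prompt(prompt: str) -> str:
    """Mitigation (prompt cleaning): mask spans that match a pattern."""
    cleaned = prompt
    for pattern in INJECTION_PATTERNS:
        cleaned = re.sub(pattern, "[removed]", cleaned, flags=re.IGNORECASE)
    return cleaned
```

In this sketch the checks wrap the model rather than modify it, which mirrors the abstract's point that the core language model stays unchanged. A real system would likely replace the token-overlap heuristic with retrieval and an entailment or claim-verification step, and would regenerate the response when the risk label is high.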

Keywords :
AI Hallucination; Prompt Injection Attack; Large Language Models; AI Security; Natural Language Processing; Adversarial Prompts; Detection Framework; Mitigation Strategy; Retrieval-Augmented Generation; Risk Scoring Engine; Retrieval-Augmented Generation for Knowledge.

Publication Details:

Unique Identification Number : IJTSRD101287

Published In : Special Issue | Recent Advances in Computer Applications and Information Technology, March 2026

Page Number(s) : 98-102

Publisher Name : IJTSRD | www.ijtsrd.com | E-ISSN 2456-6470

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
